Unsupervised measures for parameter selection of binarization algorithms
Abstract
In this paper, we propose a mechanism for systematically comparing the efficacy of unsupervised evaluation methods for parameter selection of binarization algorithms in optical character recognition (OCR). We also analyze these measures statistically and ascertain whether a measure is suitable for assessing a binarization method. The comparison process is organized into three steps. Given an unsupervised measure and a binarization algorithm, we: (i) find the best parameter combination for the algorithm in terms of the measure, (ii) run an OCR engine on the resulting best binarization of an image, and (iii) evaluate the accuracy of the characters detected. We also propose a new unsupervised measure and a statistical test to compare measures based on an intuitive triad of possible results: better, worse or comparable performance. The comparison method and statistical tests can be easily generalized to new measures, binarization algorithms and even other accuracy-driven tasks in image processing. Finally, we perform an extensive comparison of several well-known measures, binarization algorithms and OCRs, and use it to show the strengths of the WV measure. © 2010 Elsevier Ltd. All rights reserved.
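The three steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `binarize` is a plain global threshold standing in for a real binarization algorithm, `between_class_variance` (Otsu's criterion) stands in for an unsupervised measure and is not the WV measure, `ocr` is a hypothetical callable supplied by the caller, and character accuracy is approximated with a string similarity ratio.

```python
from difflib import SequenceMatcher
from itertools import product


def binarize(image, threshold):
    """Stand-in algorithm: global threshold, 1 = dark ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def between_class_variance(image, binary):
    """Stand-in unsupervised measure (Otsu's criterion); higher is better."""
    pairs = [(px, b) for row, brow in zip(image, binary)
             for px, b in zip(row, brow)]
    fg = [px for px, b in pairs if b]
    bg = [px for px, b in pairs if not b]
    if not fg or not bg:
        return 0.0  # degenerate binarization: everything in one class
    n = len(pairs)
    mean_fg = sum(fg) / len(fg)
    mean_bg = sum(bg) / len(bg)
    return (len(fg) / n) * (len(bg) / n) * (mean_fg - mean_bg) ** 2


def select_parameters(image, algorithm, measure, grid):
    """Step (i): exhaustive search for the parameters that maximize the measure."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = measure(image, algorithm(image, **params))
        if score > best_score:
            best_params, best_score = params, score
    return best_params


def character_accuracy(recognized, truth):
    """Step (iii): similarity ratio in [0, 1], an assumed proxy for accuracy."""
    return SequenceMatcher(None, truth, recognized).ratio()


def evaluate_measure(image, truth, algorithm, measure, grid, ocr):
    """Run steps (i)-(iii) on one image; `ocr` maps a binary image to text."""
    params = select_parameters(image, algorithm, measure, grid)
    recognized = ocr(algorithm(image, **params))  # step (ii)
    return params, character_accuracy(recognized, truth)
```

Repeating `evaluate_measure` over a set of images for two different measures would yield paired accuracy scores, which is the kind of input a better/worse/comparable test needs; the test itself is not specified in the abstract and is not sketched here.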
Similar resources
Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
Transition thresholds and transition operators for binarization and edge detection
The transition method for image binarization is based on the concept of t-transition pixels, a generalization of edge pixels, and t-transition sets. We introduce a novel unsupervised thresholding for unimodal histograms to estimate the transition sets. We also present dilation and incidence transition operators to refine the transition set. Afterward, we propose the simple edge transition opera...
Unsupervised Fuzzy Tournament Selection
Tournament selection has been widely used and studied in evolutionary algorithms. The tournament size is a crucial parameter for this method: it influences the algorithm's convergence, the population diversity and the solution quality. This paper presents a new technique to adjust this parameter dynamically using fuzzy unsupervised learning. The efficiency of the proposed technique is shown...
Estimation of Proper Parameter Values for Document Binarization
Most existing document-binarization techniques have many parameters whose values must be set a priori. Because ground-truth images are unavailable, the evaluation of document binarization techniques is subjective and relies on human observers to estimate the appropriate parameter values. The selection of the appropriate values for these parameters is crucial an...
Journal: Pattern Recognition
Volume: 44
Publication date: 2011